SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANT: LOWE, JOHN B.

(ii) TITLE OF INVENTION: METHODS AND PRODUCTS FOR THE SYNTHESIS 
OF OLIGOSACCHARIDE STRUCTURES ON GLYCOPROTEINS,
GLYCOLIPIDS, OR AS FREE MOLECULES, AND FOR THE ISOLATION 
OF CLONED GENETIC SEQUENCES THAT DETERMINE THESE STRUCTU 

(iii) NUMBER OF SEQUENCES: 14 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: OBLON, SPIVAK, MCCLELLAND, MAIER & NEUSTADT, P.C.

(B) STREET: 1755 Jefferson Davis Highway, Fourth Floor

(C) CITY: Arlington 

(D) STATE: Virginia 

(E) COUNTRY: U.S.A. 

(F) ZIP: 22202 

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US 

(B) FILING DATE: 20-JUL-1992 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Lavalleye, Jean-Paul M. P. 

(B) REGISTRATION NUMBER: 31,451 

(C) REFERENCE/DOCKET NUMBER: 2363-060-55

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (703)521-4500 

(B) TELEFAX: (703)486-2347 

(C) TELEX: 248855 OPAT UR 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2043 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 
(ii) MOLECULE TYPE: cDNA
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:





AGGAAACCTG CCATGGCCTC CTGGTGAGCT GTCCTCATCC ACTGCTCGCT GCCTCTCCAG 60

ATACTCTGAC CCATGGATCC CCTGGGTGCA GCCAAGCCAC AATGGCCATG GCGCCGCTGT 120

CTGGCCGCAC TGCTATTTCA GCTGCTGGTG GCTGTGTGTT TCTTCTCCTA CCTGCGTGTG 180

TCCCGAGACG ATGCCACTGG ATCCCCTAGG GCTCCCAGTG GGTCCTCCCG ACAGGACACC 240

ACTCCCACCC GCCCGACCCT CCTGATCCTG CTATGGACAT GGCCTTTCCA CATCCCTGTG 300

GCTCTGTCCC GCTGTTCAGA GATGGTGCCC GGCACAGCCG ACTGCCACAT CACTGCCGAC 360

CGCAAGGTGT ACCCACAGGC AGACACGGTC ATCGTGCACC ACTGGGATAT CATGTCCAAC 420

CCTAAGTCAC GCCTCCCACC TTCCCCGAGG CCGCAGGGGC AGCGCTGGAT CTGGTTCAAC 480

TTGGAGCCAC CCCCTAACTG CCAGCACCTG GAAGCCCTGG ACAGATACTT CAATCTCACC 540

ATGTCCTACC GCAGCGACTC CGACATCTTC ACGCCCTACG GCTGGCTGGA GCCGTGGTCC 600

GGCCAGCCTG CCCACCCACC GCTCAACCTC TCGGCCAAGA CCGAGCTGGT GGCCTGGGCG 660

GTGTCCAACT GGAAGCCGGA CTCAGCCAGG GTGCGCTACT ACCAGAGCCT GCAGGCTCAT 720

CTCAAGGTGG ACGTGTACGG ACGCTCCCAC AAGCCCCTGC CCAAGGGGAC CATGATGGAG 780

ACGCTGTCCC GGTACAAGTT CTACCTGGCC TTCGAGAACT CCTTGCACCC CGACTACATC 840

ACCGAGAAGC TGTGGAGGAA CGCCCTGGAG GCCTGGGCCG TGCCCGTGGT GCTGGGCCCC 900

AGCAGAAGCA ACTACGAGAG GTTCCTGCCA CCCGACGCCT TCATCCACGT GGACGACTTC 960

CAGAGCCCCA AGGACCTGGC CCGGTACCTG CAGGAGCTGG ACAAGGACCA CGCCCGCTAC 1020

CTGAGCTACT TTCGCTGGCG GGAGACGCTG CGGCCTCGCT CCTTCAGCTG GGCACTGGAT 1080

TTCTGCAAGG CCTGCTGGAA ACTGCAGCAG GAATCCAGGT ACCAGACGGT GCGCAGCATA 1140

GCGGCTTGGT TCACCTGAGA GGCCGGCATG GTGCCTGGGC TGCCGGGAAC CTCATCTGCC 1200

TGGGGCCTCA CCTGCTGGAG TCCTTTGTGG CCAACCCTCT CTCTTACCTG GGACCTCACA 1260

CGCTGGGCTT CACGGCTGCC AGGAGCCTCT CCCCTCCAGA AGACTTGCCT GCTAGGGACC 1320
TCGCCTGCTG GGGACCTCGC CTGTTGGGGA CCTCACCTGC TGGGGACCTC ACCTGCTGGG 1380

GACCTTGGCT GCTGGAGGCT GCACCTACTG AGGATGTCGG CGGTCGGGGA CTTTACCTGC 1440

TGGGACCTGC TCCCAGAGAC CTTGCCACAC TGAATCTCAC CTGCTGGGGA CCTCACCCTG 1500

GAGGGCCCTG GGCCCTGGGG AACTGGCTTA CTTGGGGCCC CACCCGGGAG TGATGGTTCT 1560

GGCTGATTTG TTTGTGATGT TGTTAGCCGC CTGTGAGGGG TGCAGAGAGA TCATCACGGC 1620

ACGGTTTCCA GATGTAATAC TGCAAGGAAA AATGATGACG TGTCTCCTCA CTCTAGAGGG 1680

GTTGGTCCCA TGGGTTAAGA GCTCACCCCA GGTTCTCACC TCAGGGGTTA AGAGCTCAGA 1740

GTTCAGACAG GTCCAAGTTC AAGCCCAGGA CCACCACTTA TAGGGTACAG GTGGGATCGA 1800

CTGTAAATGA GGACTTCTGG AACATTCCAA ATATTCTGGG GTTGAGGGAA ATTGCTGCTG 1860

TCTACAAAAT GCCAAGGGTG GACAGGCGCT GTGGCTCACG CCTGTAATTC CAGCACTTTG 1920

GGAGGCTGAG GTAGGAGGAT TGATTGAGGC CAAGAGTTAA AGACCAGCCT GGTCAATATA 1980

GCAAGACCAC GTCTCTAAAT AAAAAATAAT AGGCCGGCCA GGAAAAAAAA AAAAAAAAAA 2040

AAA 2043

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 361 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Asp Pro Leu Gly Ala Ala Lys Pro Gln Trp Pro Trp Arg Arg Cys
1 5 10 15

Leu Ala Ala Leu Leu Phe Gln Leu Leu Val Ala Val Cys Phe Phe Ser
20 25 30

Tyr Leu Arg Val Ser Arg Asp Asp Ala Thr Gly Ser Pro Arg Ala Pro
35 40 45

Ser Gly Ser Ser Arg Gln Asp Thr Thr Pro Thr Arg Pro Thr Leu Leu
50 55 60

Ile Leu Leu Trp Thr Trp Pro Phe His Ile Pro Val Ala Leu Ser Arg
65 70 75 80

Cys Ser Glu Met Val Pro Gly Thr Ala Asp Cys His Ile Thr Ala Asp
85 90 95

Arg Lys Val Tyr Pro Gln Ala Asp Thr Val Ile Val His His Trp Asp
100 105 110

Ile Met Ser Asn Pro Lys Ser Arg Leu Pro Pro Ser Pro Arg Pro Gln
115 120 125

Gly Gln Arg Trp Ile Trp Phe Asn Leu Glu Pro Pro Pro Asn Cys Gln
130 135 140

His Leu Glu Ala Leu Asp Arg Tyr Phe Asn Leu Thr Met Ser Tyr Arg
145 150 155 160

Ser Asp Ser Asp Ile Phe Thr Pro Tyr Gly Trp Leu Glu Pro Trp Ser
165 170 175

Gly Gln Pro Ala His Pro Pro Leu Asn Leu Ser Ala Lys Thr Glu Leu
180 185 190

Val Ala Trp Ala Val Ser Asn Trp Lys Pro Asp Ser Ala Arg Val Arg
195 200 205

Tyr Tyr Gln Ser Leu Gln Ala His Leu Lys Val Asp Val Tyr Gly Arg
210 215 220

Ser His Lys Pro Leu Pro Lys Gly Thr Met Met Glu Thr Leu Ser Arg
225 230 235 240

Tyr Lys Phe Tyr Leu Ala Phe Glu Asn Ser Leu His Pro Asp Tyr Ile
245 250 255

Thr Glu Lys Leu Trp Arg Asn Ala Leu Glu Ala Trp Ala Val Pro Val
260 265 270

Val Leu Gly Pro Ser Arg Ser Asn Tyr Glu Arg Phe Leu Pro Pro Asp
275 280 285

Ala Phe Ile His Val Asp Asp Phe Gln Ser Pro Lys Asp Leu Ala Arg
290 295 300

Tyr Leu Gln Glu Leu Asp Lys Asp His Ala Arg Tyr Leu Ser Tyr Phe
305 310 315 320

Arg Trp Arg Glu Thr Leu Arg Pro Arg Ser Phe Ser Trp Ala Leu Asp
325 330 335

Phe Cys Lys Ala Cys Trp Lys Leu Gln Gln Glu Ser Arg Tyr Gln Thr
340 345 350

Val Arg Ser Ile Ala Ala Trp Phe Thr
355 360

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1500 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: cDNA 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

CCTTCCCTTG TAGACTCTTC TTGGAATGAG AAGTACCGAT TCTGCTGAAG ACCTCGCGCT 60

CTCAGGCTCT GGGAGTTGGA ACCCTGTACC TTCCTTTCCT CTGCTGAGCC CTGCCTCCTT 120

AGGCAGGCCA GAGCTCGACA GAACTCGGTT GCTTTGCTGT TTGCTTTGGA GGGAACACAG 180

CTGACGATGA GGCTGACTTT GAACTCAAGA GATCTGCTTA CCCCAGTCTC CTGGAATTAA 240

AGGCCTGTAC TACATTTGCC TGGACCTAAG ATTTTCATGA TCACTATGCT TCAAGATCTC 300

CATGTCAACA AGATCTCCAT GTCAAGATCC AAGTCAGAAA CAAGTCTTCC ATCCTCAAGA 360

TCTGGATCAC AGGAGAAAAT AATGAATGTC AAGGGAAAAG TAATCCTGTT GATGCTGATT 420

GTCTCAACCG TGGTTGTCGT GTTTTGGGAA TATGTCAACA GAATTCCAGA GGTTGGTGAG 480

AACAGATGGC AGAAGGACTG GTGGTTCCCA AGCTGGTTTA AAAATGGGAC CCACAGTTAT 540

CAAGAAGACA ACGTAGAAGG ACGGAGAGAA AAGGGTAGAA ATGGAGATCG CATTGAAGAG 600

CCTCAGCTAT GGGACTGGTT CAATCCAAAG AACCGCCCGG ATGTTTTGAC AGTGACCCCG 660

TGGAAGGCGC CGATTGTGTG GGAAGGCACT TATGACACAG CTCTGCTGGA AAAGTACTAC 720

GCCACACAGA AACTCACTGT GGGGCTGACA GTGTTTGCTG TGGGAAAGTA CATTGAGCAT 780

TACTTAGAAG ACTTTCTGGA GTCTGCTGAC ATGTACTTCA TGGTTGGCCA TCGGGTCATA 840

TTTTACGTCA TGATAGACGA CACCTCCCGG ATGCCTGTCG TGCACCTGAA CCCTCTACAT 900

TCCTTACAAG TCTTTGAGAT CAGGTCTGAG AAGAGGTGGC AGGATATCAG CATGATGCGC 960

ATGAAGACCA TTGGGGAGCA CATCCTGGCC CACATCCAGC ACGAGGTCGA CTTCCTCTTC 1020

TGCATGGACG TGGATCAAGT CTTTCAAGAC AACTTCGGGG TGGAAACTCT GGGCCAGCTG 1080

GTAGCACAGC TCCAGGCCTG GTGGTACAAG GCCAGTCCCG AGAAGTTCAC CTATGAGAGG 1140

CGGGAACTGT CGGCCGCGTA CATTCCATTC GGAGAGGGGG ATTTTTACTA CCACGCGGCC 1200

ATTTTTGGAG GAACGCCTAC TCACATTCTC AACCTCACCA GGGAGTGCTT TAAGGGGATC 1260

CTCCAGGACA AGAAACATGA CATAGAAGCC CAGTGGCATG ATGAGAGCCA CCTCAACAAA 1320

TACTTCCTTT TCAACAAACC CACTAAAATC CTATCTCCAG AGTATTGCTG GGACTATCAG 1380

ATAGGCCTGC CTTCAGATAT TAAAAGTGTC AAGGTAGCTT GGCAGACAAA AGAGTATAAT 1440

TTGGTTAGAA ATAATGTCTG ACTTCAAATT GTGATGGAAA CTTGACACTA TTTCTAACCA 1500



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 394 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Ile Thr Met Leu Gln Asp Leu His Val Asn Lys Ile Ser Met Ser
1 5 10 15

Arg Ser Lys Ser Glu Thr Ser Leu Pro Ser Ser Arg Ser Gly Ser Gln
20 25 30

Glu Lys Ile Met Asn Val Lys Gly Lys Val Ile Leu Leu Met Leu Ile
35 40 45

Val Ser Thr Val Val Val Val Phe Trp Glu Tyr Val Asn Arg Ile Pro
50 55 60

Glu Val Gly Glu Asn Arg Trp Gln Lys Asp Trp Trp Phe Pro Ser Trp
65 70 75 80

Phe Lys Asn Gly Thr His Ser Tyr Gln Glu Asp Asn Val Glu Gly Arg
85 90 95

Arg Glu Lys Gly Arg Asn Gly Asp Arg Ile Glu Glu Pro Gln Leu Trp
100 105 110

Asp Trp Phe Asn Pro Lys Asn Arg Pro Asp Val Leu Thr Val Thr Pro
115 120 125

Trp Lys Ala Pro Ile Val Trp Glu Gly Thr Tyr Asp Thr Ala Leu Leu
130 135 140

Glu Lys Tyr Tyr Ala Thr Gln Lys Leu Thr Val Gly Leu Thr Val Phe
145 150 155 160

Ala Val Gly Lys Tyr Ile Glu His Tyr Leu Glu Asp Phe Leu Glu Ser
165 170 175

Ala Asp Met Tyr Phe Met Val Gly His Arg Val Ile Phe Tyr Val Met
180 185 190

Ile Asp Asp Thr Ser Arg Met Pro Val Val His Leu Asn Pro Leu His
195 200 205

Ser Leu Gln Val Phe Glu Ile Arg Ser Glu Lys Arg Trp Gln Asp Ile
210 215 220

Ser Met Met Arg Met Lys Thr Ile Gly Glu His Ile Leu Ala His Ile
225 230 235 240

Gln His Glu Val Asp Phe Leu Phe Cys Met Asp Val Asp Gln Val Phe
245 250 255

Gln Asp Asn Phe Gly Val Glu Thr Leu Gly Gln Leu Val Ala Gln Leu
260 265 270

Gln Ala Trp Trp Tyr Lys Ala Ser Pro Glu Lys Phe Thr Tyr Glu Arg
275 280 285

Arg Glu Leu Ser Ala Ala Tyr Ile Pro Phe Gly Glu Gly Asp Phe Tyr
290 295 300

Tyr His Ala Ala Ile Phe Gly Gly Thr Pro Thr His Ile Leu Asn Leu
305 310 315 320

Thr Arg Glu Cys Phe Lys Gly Ile Leu Gln Asp Lys Lys His Asp Ile
325 330 335

Glu Ala Gln Trp His Asp Glu Ser His Leu Asn Lys Tyr Phe Leu Phe
340 345 350

Asn Lys Pro Thr Lys Ile Leu Ser Pro Glu Tyr Cys Trp Asp Tyr Gln
355 360 365

Ile Gly Leu Pro Ser Asp Ile Lys Ser Val Lys Val Ala Trp Gln Thr
370 375 380

Lys Glu Tyr Asn Leu Val Arg Asn Asn Val
385 390

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8174 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:





GAATTCCATC GTGGCAAGGG CAGCCTGAAT GGATGATGTA ACCTGGGGTC CTTTCAATGG 60

AGGGCCAGAC TCCTGGGTCT AGGGGATGAG GGAGGGGAGG ATCGGGTTAG CTGGGACCCA 120

GGTGAAAGGG GCTGGGGGCC CACATTCCTG AGTCTCAGAG AGAAGGATCT GGGGTCTCAA 180

GCACCTGAGT CGGAGGGAGG AGGGGTGCTG GGCTCCTGGA AAAACCACCT CTTGGACCAT 240

CTATGCAGAT CACGCAGAAC AAGAGAAATT TCTGCGCCCC ATCTGAATTT CTAAGTTTGG 300

GGGGAGGGCG TGATCTGACA CTGAGGTTCC TTGATCCTCA GCAAGGCGGC AATTGCTGTA 360

TGAAAGAAGC GACCGCATCT GAGACACAAG TATCCTGCCT TGGAAGCCTC TCACCTGGCC 420

GTGGGCCAAC CTCAACCTCA TCTGTCCCTG CTCAGATGCT CAGACCCTGG ACATCCCAGC 480

CTCCTCCTCC CTGATGCAAT CCTGGTGTTT CTTTCACCAG AGAAGCCATC CCAGGCCCAG 540

GCAGGTGCTC CTGAAATAAC CTGGGGGGAG GGGTGGCTGA AAGTCCCTGA CTGGAGTTGG 600

CAGCCAAGCC AGGCCCTGGA GTGGGCACCC AGAGGGAAGA CAGGTTGGCT AATTTCCTGG 660

AGCCCCTAAG GGTGCAAGGG TAGGCCTTCT GTGTCTGAGG GAGGAGGGCT GGGGCTCTGG 720

ACTCCTGGGT CTGAGGGAGG AGGGGTGGGG GGCCTGGACT CCTGGGTCTG AGGGAGGAGG 780

GTCTGGGCCT GTACTCCTGG ATCTGAGGGA GGAGGGGCTG GGGAACTTGG GCTCCTGGGT 840

CTGAGGGAGG AGGGAGCTTT GGTCTGGACT CCTGGGTCTG AGGGAGTAGG GGCTAGGGAT 900
CTGGACTCGT GGGTGTGAGG AAGGAGGGGC TGGGGTCCTG GACTCCTGGG TCTGAGGAAG 960

GAGGGGCAGG GGGCTTGGAC TCCTGGGTCT GAGGAAGGAG GGGCCGGGAG CCTGGACTCC 1020

TAAGTCTGAG GGAGGAGGGT CTGGGGGCCT GGACTGCTGG GTGTGAGCAG AAGGGTCTGG 1080

GTGCTGGGAG TCCCGAGCCT GGGGAGATGA TGGTTAAACT TCTGGGAATC AAGTCAAACT 1140

CCTGAGTCTT TGACATTGAT GTATCTTGAA TGGGAGGGTC AGTCTGTGGG GAAGGATTAC 1200

CCAGGTGCCG AGGCAAGAGA CTGAAGGCAC AAACTGTTTC AGTATAATAA AGAAAATAGT 1260

TAGAATAAGA ATAGTTATCA TACAAATTAG ATATAGAGAT GATCATGGAC AGTATCAATC 1320

ATTAGTGTAA ACATTATTAA TCATTAGCTA TTACTTTTAT TCTTTGTTGT ATAACTAATA 1380

TAACCAGGAA ACAACCGGTG GGTATAGGGT CAGGTACTGA AGGGACATTG TGAGAAGTGA 1440

CCTAGAAGGC AAGAGGTGAG CCTTCTGTCA CACCGGCATA AGGGCCTCTT GAGGGCTCCT 1500

TGGTCAAGCG GGAACGCCAG TGTCTGGGAA GGCACCCGTT ACTCAGCAGA CCACGAAAGG 1560

GAATCTCCTT TTCTTGGAGG AGTCAGGGAA CACTCTGCTC CACCAGCTTC TTGTGGGAGG 1620


CTGGGTATTA JTCTAGilCCTG CCCGCAGTCA ATGGTCACGC 1680


TCCTTGTCCT CTTGCATTTT CCTCCCGTAC TCCTGGTTCC TCTTTGAAGT TCGTAGTAGA 1740

TAGCGGTAGA AGAAATAGTG AAAGCCTTTT TTTTTTTTTT TTTGAGGCGG AGTCTCGCTC 1800

TGTCCCCCAG GCTGGAGTGC AGTGGCGTGA TCTCGGCTCA CTGCAATCTC CGCCTCCTGG 1860

GTTCACACCA TTCTCCTGCC TCACCCTCCC AAATAGCTAG GACTACAGGC GCCCTCCACC 1920

ACGCGCCCGG ATAATTTTTT GTATTTTTAG TAGAGACAGG GTTTCACCGT GTTAGCCAGG 1980

ATGGCCTCCA CCTCCTGACC TTGTGATCCG CCCGCCTCAG CCTCCCAAAG TGCTGGGATT 2040

ACAGGCGTGA GCCACCGCGC CCGCCCGAAA TAGTGAAAGT CTTAAAGTCT TTGATCTTTC 2100

TTATAAGTGC AGAGAAGAAA ACGCTGACAT ATGCTGCCTT CTCTTTCTGC TTCGGCTGCC 2160

TAAAAGGGAA GGGCCCCCTG TCCCATGATC ACGTGACTTG CTTGACCTTA TCAGTCATTT 2220

GGACGACTCA CCCTCCTTAT CCTGCCCCCC CTTGTCTTGT ATACAATAAA TATCAGCGCG 2280

CCCAGCCATT CGGGGCCACT ACCGGTCTCT GCGTCTTGAT GGTAGTGGTC CCCCGGGCCC 2340

AGCTGTTTTC TCTTTATCTC TTTGTCTTGT GTCTTTATTT CTTACAATCT CTCCTCTCCT 2400

CACAGGGGAA GAACACCCAC CCGCAAAGCC CCGTAGGGCT GGACCCTACG TTAGCCTGCC 2460
CTGCTCGGGG TTGGCGATGC TGGAGGTGGG CCTTGGACCA GAGAAAATGC TTTAATTAGG 2520

TGACAAGCGG GCAGAGGCCT TTGTCTCTGG CGCCGGCAGC CACGGCCCCC GCTGACGGCG 2580 

TGGGAAACAG ACCCTGTTCC ACTCCGGTCT CCAGCCTTGG AATGGTTGCC TTCGTGCAGT 2640 

GCAGGTCTGG AAAGTAGCAG TTTGGCACGG GACCCTAGAA TTCCCCAAAA GGAGTGACTA 2700 

GGGGCTGGGA TTCTGGAATT TGAGTGTGGA CGGTGAGGCG GGGGGTGTGG GAGATCGGAG 2760 

ACCCTGGTGG GCGCGGGAGC ACCTGCAGGC TGGAGGCCCT CGCGCGCTCC GGCGGCAGCC 2820 

TGGCAAACAG GTTCTCCATC CCCCAGGAGG ACGCGGCAGA GGGCGGACGA TCGCTCCACT 2880 

CGCCGGGACC AGGTGCGGGG GCCCTGCCCA GCCGCTGGGG CGTGGCCAGG CTCGAAGCAC 2940 

CCAGGTGTCG GGGGCCGACT CTAAGCCCTG GCACCGGAAG AGAGAGGGCG GCGGATTGGA 3000

CCTCCCGGCT CCAGCATTGC AACTGGGCGC TCCGTCTCCT GGTCCACGCA ATGATGCTGC 3060

GGCTGCTCAG AAGCCAGGTA GCCTGCCCTG GGTGAAGCCT TCGCGCAGGT CAATGACGGG 3120

GCGGAGGGGC AGGGCGCGGT CCCCTGCATC CCCGATCTGG GGAGCGGTGG GCCCAGGGGC 3180

CATCGCCTTA GCCCCTGGCG CTGGGGCTCG GCGCCAAGTG ACGGGCGGGG CTCCACCTTC 3240

CAGCCATCCG CCCGGCCCGG GAGGGCGGAC GCTGCGAGAC TCCCGGCCGC GCCCTCTCCT 3300

TCCTCTCCTC CCCAAGCCCT CGCTGCCAGT CCGGACAGGC TGCGCGGAGG GGAGGGCTGC 3360

CGGGCCGGAT AGCCGGACGC CTGGCGTTCC AGGGGCGGCC GGATGTGGCC TGCCTTTGCG 3420

GAGGGTGCGC TCCGGCCACG AAAAGCGGAC TGTGGATCTG CCACCTGCAA GCAGCTCGGG 3480

TAAGTGGGGA CTGCCCCACT CAGTTGTTCC TGGGACCCAG GAACAACTCC TTCAGAACCA 3540

GGAGGTGCAC CCCCAACCTC TTCTCCAGGT CTTCCTAAGG CCCTAGGAAT CTCCGCCACC 3600

TCCCCAGCCA TTACTCCTCC AGGAACCAAG ATGCTCCTTC CGCTCCTGAC CCTCCAGCCT 3660

CTCTTGTTTT ACTTGAACTA TCGTTTCCCA TCACCACCTC TGTGGTGGAT TTTGCGCCTC 3720

ACAGACAGGT ACTCCTGAGA AACAGGCTGG TGGAAGAGTC CAGTATCAGC GGAACTTACA 3780

GGAGGGGAGA CTCGAGATTC CTTCAGGAAA GGTGTAGGAA CCTGGACCAC TTTCTTTTTT 3840

TTTTTTTTTT TTTTTTTAAG ACAGGGTCCC TCTCTGTCGC GCAAGCTGGA GTGCAGTCAG 3900

CGGTGCTATC GCGGCTCATT GTGAGCTCCG GGGATCCTCC CGCCTTAGCA TCCGGTGTAG 3960

CTGAGACCAC AGACATGTGC CACCATGCCA AGCTAATTTT ATTTATTTTT TTTTGGAGAC 4020
GGAGTTTCAC TCTTGTTGCC CAGGCTGGAG TGTAATGGCA TGATCTCAGC TCACCGCAAC 4080

TCCCGCCCCC CGGGTTCAGG CGATTCTCCT GCCTCAGCCT CCCGAGTGGC TGGGATTACA 4140

GGCATGCGCC ACCATGCCCG GCTAATTTTG TATTTTAAGT AGAGACAGGG TTTCTCCACG 4200

TTGGTCAGGC TGGTCTCGAA CTCCCAACCT CAGGTGATCC ACCCACCTTG GCCTCCCAAA 4260

GTGCTGGGAT TACAGGTGTG AGCCACCGCG CCTGGCCCAT GCCAAGCTAA TTTTAAAATT 4320

TTTTTGTAAG AGTGCTCTGT TGCCCAGGCT GATCTTGAAC TCCTGGGCTC AAGGGATCCT 4380

GCCATCTCAG CCTCCCAATA TGCTGGGATT ACAGGTGTGA GCCACAGTGC CCAGCCAAAC 4440

CATGGCTATC TTGAAAACCA CTTGTCTTCC AGTCCCCATG CCCCGAAATT CCAAGGCTCT 4500

CATCCCTGAA ACCTAGGACT CAGGCTCTCC CTACCTCAGC CCCAGGAGTC TAAACCTTTA 4560

ACTTCCTCTT TCCCTGGGAC TAAGGAGTGC TGCACCCCAG GCGCCTCCCT TACCCCACAT 4620

CCCTCCTCAG CCTCCCCTCC TCAGCCTCAG TGCATTTGCT AATTCGCCTT TCCTCCCCTG 4680

CAGCCATGTG GCTCCGGAGC CATCGTCAGC TCTGCCTGGC CTTCCTGCTA GTCTGTGTCC 4740

TCTCTGTAAT CTTCTTCCTC CATAXCCATC AAGA£AG-CTT^TCCACATC^ 4800

CGATCCTGTG TCCAGACCGC CGCCTGGTGA CACCCCCAGT GGCCATCTTC TGCCTGCCGG 4860

GTACTGCGAT GGGCCCCAAC GCCTCCTCTT CCTGTCCCCA GCACCCTGCT TCCCTCTCCG 4920

GCACCTGGAC TGTCTACCCC AATGGCCGGT TTGGTAATCA GATGGGACAG TATGCCACGC 4980

TGCTGGCTCT GGCCCAGCTC AACGGCCGCC GGGCCTTTAT CCTGCCTGCC ATGCATGCCG 5040

CCCTGGCCCC GGTATTCCGC ATCACCCTGC CCGTGCTGGC CCCAGAAGTG GACAGCCGCA 5100

CGCCGTGGCG GGAGCTGCAG CTTCACGACT GGATGTCGGA GGAGTACGCG GACTTGAGAG 5160

ATCCTTTCCT GAAGCTCTCT GGCTTCCCCT GCTCTTGGAC TTTCTTCCAC CATCTCCGGG 5220

AACAGATCCG CAGAGAGTTC ACCCTGCACG ACCACCTTCG GGAAGAGGCG CAGAGTGTGC 5280

TGGGTCAGCT CCGCCTGGGC CGCACAGGGG ACCGCCCGCG CACCTTTGTC GGCGTCCACG 5340

TGCGCCGTGG GGACTATCTG CAGGTTATGC CTCAGCGCTG GAAGGGTGTG GTGGGCGACA 5400

GCGCCTACCT CCGGCAGGCC ATGGACTGGT TCCGGGCACG GCACGAAGCC CCCGTTTTCG 5460

TGGTCACCAG CAACGGCATG GAGTGGTGTA AAGAAAACAT CGACACCTCC CAGGGCGATG 5520

TGACGTTTGC TGGCGATGGA CAGGAGGCTA CACCGTGGAA AGACTTTGCC CTGCTCACAC 5580
AGTGCAACCA CACCATTATG ACCATTGGCA CCTTCGGCTT CTGGGCTGCC TACCTGGCTG 5640

GCGGAGACAC TGTCTACCTG GCCAACTTCA CCCTGCCAGA CTCTGAGTTC CTGAAGATCT 5700

TTAAGCCGGA GGCGGCCTTC CTGCCCGAGT GGGTGGGCAT TAATGCAGAC TTGTCTCCAC 5760

TCTGGACATT GGCTAAGCCT TGAGAGCCAG GGAGACTTTC TGAAGTAGCC TGATCTTTCT 5820

AGAGCCAGCA GTACGTGGCT TCAGAGGCCT GGCATCTTCT GGAGAAGCTT GTGGTGTTCC 5880

TGAAGCAAAT GGGTGCCCGT ATCCAGAGTG ATTCTAGTTG GGAGAGTTGG AGAGAAGGGG 5940

GACGTTTCTG GAACTGTCTG AATATTCTAG AACTAGCAAA ACATCTTTTC CTGATGGCTG 6000

GCAGGCAGTT CTAGAAGCCA CAGTGCCCAC CTGCTCTTCC CAGCCCATAT CTACAGTACT 6060

TCCAGATGGC TGCCCCCAGG AATGGGGAAC TCTCCCTCTG GTCTACTCTA GAAGAGGGGT 6120

TACTTCTCCC CTGGGTCCTC CAAAGACTGA AGGAGCATAT GATTGCTCCA GAGCAAGCAT 6180

TCACCAAGTC CCCTTCTGTG TTTCTGGAGT GATTCTAGAG GGAGACTTGT TCTAGAGAGG 6240

ACCAGGTTTG ATGCCTGTGA AGAACCCTGC AGGGCCCTTA TGGACAGGAT GGGGTTCTGG 6300

AAATCCAGAT AACTAAGGTG AAGAATCTTT TTAGTTTTTT TTTTTTTTTT TTGGAGACAG 6360



GGTCTCGCTC TGTTGCCCAG GCTGGAGTGC AGTGGCGTGA TCTTGGCTCA CTGCAACTTC 6420

CGCCTCCTGT GTTCAAGCGA TTCTCCTGTC TCAGCCTCCT GAGTAGATGG GACTACAGGC 6480

ACAGGCCATT ATGCCTGGCT AATTTTTGTA TTTTTAGTAG AGACAGGGTT TCACCATGTT 6540

GGCCGGGATG GTCTCGATCT CCTGACCTTG TCATCCACCT GTCTTGGCCT CCCAAAGTGC 6600

TGGGATTACT GGCATGAGCC ACTGTGCCCA GCCCGGATAT TTTTTTTTAA TTATTTATTT 6660

ATTTATTTAT TTATTGAGAC GGAGTCTTGC TCTGTAGCCC AGGCCAGAGT GCAGTGGCGC 6720

GATCTCAGCT CACTGCAAGC TCTGCCTCCC GGGTTCATGC CATTCTGCCT CAGCCTCCTG 6780

AGTAGCTGGG ACTACAGGCG CCCGCCACCA CGCCCGGCTA ATTTTTTTTG TATTTTTAGT 6840

AGAGACGGGG TTTCATCGTG TTAACCAGGA TGGTCTCGAT CTCCTGACCT CGTGATCTGC 6900

CCACCTCGGC CTCCCACAGT GCTGGGATTA CCGGCGTGAG CCACCATGCC TGGCCCGGAT 6960

AATTTTTTTT AATTTTTGTA GAGACGAGGT CTTGTGATAT TGCCCAGGCT GTTCTTCAAC 7020

TCCTGGGCTC AAGCAGTCCT CCCACCTTGG CCTCCCAGAA TGCTGGGTTT ATAGATGTGA 7080

GCCAGCACAC CGGGCCAAGT GAAGAATCTA ATGAATGTGC AACCTAATTG TAGCATCTAA 7140
TGAATGTTCC ACCATTGCTG GAAAAATTGA GATGGAAAAC AAACCATCTC TAGTTGGCCA 7200

GCGTCTTGCT CTGTTCACAG TCTCTGGAAA AGCTGGGGTA GTTGGTGAGC AGAGCGGGAC 7260

TCTGTCCAAC AAGCCCCACA GCCCCTCAAA GACTTTTTTT TGTTTGTTTT GAGCAGACAG 7320

GCTAAAATGT GAACGTGGGG TGAGGGATCA CTGCCAAAAT GGTACAGCTT CTGGAGCAGA 7380

ACTTTCCAGG GATCCAGGGA CACTTTTTTT TAAAGCTCAT AAACTGCCAA GAGCTCCATA 7440

TATTGGGTGT GAGTTCAGGT TGCCTCTCAC AATGAAGGAA GTTGGTCTTT GTCTGCAGGT 7500

GGGCTGCTGA GGGTCTGGGA TCTGTTTTCT GGAAGTGTGC AGGTATAAAC ACACCCTCTG 7560

TGCTTGTGAC AAACTGGCAG GTACCGTGCT CATTGCTAAC CACTGTCTGT CCCTGAACTC 7620

CCAGAACCAC TACATCTGGC TTTGGGCAGG TCTGAGATAA AACGATCTAA AGGTAGGCAG 7680

ACCCTGGACC CAGCCTCAGA TCCAGGCAGG AGCACGAGGT CTGGCCAAGG TGGACGGGGT 7740

TGTCGAGATC TCAGGAGCCC CTTGCTGTTT TTTGGAGGGT GAAAGAAGAA ACCTTAAACA 7800

TAGTCAGCTC TGATCACATC CCCTGTCTAC TCATCCAGAC CCCATGCCTG TAGGCTTATC 7860

AGGGAGTTAC AGTTACAATT GTTACAGTAC TGTTCCCAAC TCAGCTGCCA CGGGTGAGAG 7920

AGCAGGAGGT ATGAATTAAA AGTCTACAGC ACTAACCCGT GTCTCTGTAG CTTTTTTGGA 7980

GCCAGAGCCA CTGTGTATGT GTGTGTGGGT TTGTGTGTGT GTGTGTGTGT GTGTGTGTGT 8040

AAGAGAGTGG AGGAAAAGGT GGGGTACTTC TGAAGACTTT TATTTTTTTT TAATTAATTT 8100

ATTTTTTTTC AGAGATCGAG TCTTGCTCTG TGGCCCAGGC TGGAGTGCAG TAGTGTGATC 8160

TCGGCCCACT GCAA 8174

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 365 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Trp Leu Arg Ser His Arg Gln Leu Cys Leu Ala Phe Leu Leu Val
1 5 10 15

Cys Val Leu Ser Val Ile Phe Phe Leu His Ile His Gln Asp Ser Phe
20 25 30

Pro His Gly Leu Gly Leu Ser Ile Leu Cys Pro Asp Arg Arg Leu Val
35 40 45

Thr Pro Pro Val Ala Ile Phe Cys Leu Pro Gly Thr Ala Met Gly Pro
50 55 60

Asn Ala Ser Ser Ser Cys Pro Gln His Pro Ala Ser Leu Ser Gly Thr
65 70 75 80

Trp Thr Val Tyr Pro Asn Gly Arg Phe Gly Asn Gln Met Gly Gln Tyr
85 90 95

Ala Thr Leu Leu Ala Leu Ala Gln Leu Asn Gly Arg Arg Ala Phe Ile
100 105 110

Leu Pro Ala Met His Ala Ala Leu Ala Pro Val Phe Arg Ile Thr Leu
115 120 125

Pro Val Leu Ala Pro Glu Val Asp Ser Arg Thr Pro Trp Arg Glu Leu
130 135 140

Gln Leu His Asp Trp Met Ser Glu Glu Tyr Ala Asp Leu Arg Asp Pro
145 150 155 160

Phe Leu Lys Leu Ser Gly Phe Pro Cys Ser Trp Thr Phe Phe His His
165 170 175

Leu Arg Glu Gln Ile Arg Arg Glu Phe Thr Leu His Asp His Leu Arg
180 185 190

Glu Glu Ala Gln Ser Val Leu Gly Gln Leu Arg Leu Gly Arg Thr Gly
195 200 205

Asp Arg Pro Arg Thr Phe Val Gly Val His Val Arg Arg Gly Asp Tyr
210 215 220

Leu Gln Val Met Pro Gln Arg Trp Lys Gly Val Val Gly Asp Ser Ala
225 230 235 240

Tyr Leu Arg Gln Ala Met Asp Trp Phe Arg Ala Arg His Glu Ala Pro
245 250 255

Val Phe Val Val Thr Ser Asn Gly Met Glu Trp Cys Lys Glu Asn Ile
260 265 270

Asp Thr Ser Gln Gly Asp Val Thr Phe Ala Gly Asp Gly Gln Glu Ala
275 280 285

Thr Pro Trp Lys Asp Phe Ala Leu Leu Thr Gln Cys Asn His Thr Ile
290 295 300

Met Thr Ile Gly Thr Phe Gly Phe Trp Ala Ala Tyr Leu Ala Gly Gly
305 310 315 320

Asp Thr Val Tyr Leu Ala Asn Phe Thr Leu Pro Asp Ser Glu Phe Leu
325 330 335

Lys Ile Phe Lys Pro Glu Ala Ala Phe Leu Pro Glu Trp Val Gly Ile
340 345 350

Asn Ala Asp Leu Ser Pro Leu Trp Thr Leu Ala Lys Pro
355 360 365

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3647 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

CTGCAGAGAG CGCCACCCGG AAGCCACTTT TATAGAAGCT TTTACACACA ATGCTTGATT 60

TTTTTTTTTT TTTTCCGAGA CGGAGTCTCG CTTTGTCGCC CAGGCTGGAG TGCAGTGGCG 120

CGATCTGGGC TCACTGCAAG CTCCGCCTCC TGGGTTGACG CCATTCTCCT GCCTCAGCTT 180

CCCGAGTAGC TGGGACTACA GGCGCCCGCC ACCAAGCCTG GCTAATTTTT TTTTATTTTT 240

AGTGGAGACA GAGTTTCACC GTGTTAGCCA GGATGGTCTC GATCTCCTGA CCTCGGGXTC 300

CGCCCGCCTC GGCCTCCCAA AGTGCTGGGA GTATAGGCGT GAGCCACCGC GCCTGGCCTA 360

TACTTGATTT TTAATGAAAA CATTCTTAAA TTCATATGGC TAACGCAAAT TTATTTTCTG 420

TAGGCATAAC ATCAAAAACA CCTGGCAGGA CTGCCCCATT CCCAGCACTG TCTAGTTCTC 480

CCCTAGTATC AGTGGGACTC CACTGATGCA CAGCTGTGAT CTACTAAAAC TTCTCTCAAA 540

ACTTTCTCCT CTCCTTAGGT CAGCAGCCCC GCCCCTGATC TATTTGGAAA TCCCCTGAAT 600
AAAAGTTGAA TATCATAAAC CAAAGCGAAC ACCCAGAAAT TCAAATTCAA CCCGTAGGTA 660

AAAAATTTCT CAAGTGACTG TAGACGTAGA TGTCTCCAGT GTCGCCTAAT AAGGTAGAAG 720

AGGCCAGTGC GATACTGTCT TTACACCCTT AACTTGGGTG CTAGAATATT TATCTTCGTC 780

ATCATTTTAT CATCCAAACT ATTTTGCATA ACTTTCATGG GTGCAGAAAA TGTTTTTTAA 840

GTGCTTGGTA AAATTAATAG TGATATTCAT TCATTCATCT CACTGAACAG GCAATAAATT 900

CCTTGACGAC AAGGGCCTTG GGGGGGGCCA CATCTTCATC TTTGGTTTAT GAGTCCTGTG 960

CGTCTTGGTA CAAGCAATAC TACTATGAGC CGGCAAGTCA GACTTATTTG GTAGGGGACC 1020

AAAGGAAAGA ACATGTTTTG ATTGCTAAGA AAACATTTTG TTCTCTATCC TTTACTGGGC 1080

TGGCAGGCAA AGGAAATGTT CTTATGAGCA CTCACATTGA AAACTTAAGT TCTTCACCAA 1140

ATGCAGAGAC TCTGAAGGCC ACGCCGCTGC GGGCTGCCTC CACAATTCGA CCGTCTCGGC 1200

GGGCCACGAG ATCCTGGCCA CGGATGCGGT GGCCGCGCCT CTGCTCGCAC GTTCCCCCGG 1260

CCTCTGGACT CCCTCCCTCC CTCAATCCCT CCCTCCGGCG GGCGTCGCTG GCGGGTGGCT 1320

AGGCCCAACG GCAGGAAGCC GACGCTATCC TCCGTTCCGC GGCGCCGGGT CCGCCTTCCG 1380

TCTGTTCTAG GGCCTGCTCC TGCGCGGCAG CTGCTTTAGA AGGTCTCGAG CCTCCTGTAC 1440

CTTCCCAGGG ATGAACCGGG CCTTCCCTCT GGAAGGCGAG GGTTCGGGCC ACAGTGAGCG 1500

AGGGCCAGGG CGGTGGGCGC GCGCAGAGGG AAACCGGATC AGTTGAGAGA GAATCAAGAG 1560

TAGCGGATGA GGCGCTTGTG GGGCGCGGCC CGGAAGCCCT CGGGCGCGGG CTGGGAGAAG 1620

GAGTGGGCGG AGGCGCCGCA GGAGGCTCCC GGGGCCTGGT CGGGCCGGCT GGGCCCCGGG 1680

CGCAGTGGAA GAAAGGGACG GGCGGTGCCC GGTTGGGCGT CCTGGCCAGC TCACCTTGCC 1740

CTGGCGGCTC GCCCCGCCCG GCACTTGGGA GGAGCAGGGC AGGGCCCGCG GCCTTTGCAT 1800

TCTGGGACCG CCCCCTTCCA TTCCCGGGCC AGCGGCGAGC GGCAGCGACG GCTGGAGCCG 1860

CAGCTACAGC ATGAGAGCCG GTGCCGCTCC TCCACGCCTG CGGACGCGTG GCGAGCGGAG 1920

GCAGCGCTGC CTGTTCGCGC CATGGGGGCA CCGTGGGGCT CGCCGACGGC GGCGGCGGGC 1980

GGGCGGCGCG GGTGGCGCCG AGGCCGGGGG CTGCCATGGA CCGTCTGTGT GCTGGCGGCC 2040

GCCGGCTTGA CGTGTACGGC GCTGATCACC TACGCTTGCT GGGGGCAGCT GCCGCCGCTG 2100

CCCTGGGCGT CGCCAACCCC GTCGCGACCG GTGGGCGTGC TGCTGTGGTG GGAGCCCTTC 2160
GGGGGGCGCG ATAGCGCCCC GAGGCCGCCC CCTGACTGCC CGCTGCGCTT CAACATCAGC 2220

GGCTGCCGCC TGCTCACCGA CCGCGCGTCC TACGGAGAGG CTCAGGCCGT GCTTTTCCAC 2280

CACCGCGACC TCGTGAAGGG GCCCCCCGAC TGGCCCCCGC CCTGGGGCAT CCAGGCGCAC 2340

ACTGCCGAGG AGGTGGATCT GCGCGTGTTG GACTACGAGG AGGCAGCGGC GGCGGCAGAA 2400

GCCCTGGCGA CCTCCAGCCC CAGGCCCCCG GGCCAGCGCT GGGTTTGGAT GAACTTCGAG 2460

TCGCCCTCGC ACTCCCCGGG GCTGCGAAGC CTGGCAAGTA ACCTCTTCAA CTGGACGCTC 2520

TCCTACCGGG CGGACTCGGA CGTCTTTGTG CCTTATGGCT ACCTCTACCC CAGAAGCCAC 2580

CCCGGCGACC CGCCCTCAGG CCTGGCCCCG CCACTGTCCA GGAAACAGGG GCTGGTGGCA 2640

TGGGTGGTGA GCCACTGGGA CGACCGCCAG GCCCGGGTCC GCTACTACCA CCAACTGAGC 2700

CAACATGTGA CCGTGGACGT GTTCGGCCGG GGCGGGCCGG GGCAGCCGGT GCCCGAAATT 2760

GGGCTCCTGC ACACAGTGGC CCGCTACAAG TTCTACCTGG CTTTCGAGAA CTCGCAGCAC 2820

CTGGATTATA TCACCGAGAA GCTCTGGCGC AACGCGTTGC TCGCTGGGGC GGTGCCGGTG 2880

GTGCTGGGCC CAGACCGTGC CAACTACGAG GCGTTTGTGC CCCGCGGCGC CTTCATCCAC 2940

GTGGACGACT TCCCAAGTGC CTCCTCCCTG GCCTCGTACC TGCTTTTCCT CGACCGCAAC 3000

CCCGCGGTCT ATCGCCGCTA CTTCCACTGG CGCCGGAGCT ACGCTGTCCA CATCACCTCC 3060

TTCTGGGACG AGCCTTGGTG CCGGGTGTGC CAGGCTGTAC AGAGGGCTGG GGACCGGCCC 3120

AAGAGCATAC GGAACTTGGC CAGCTGGTTC GAGCGGTGAA GCCGCGCTCC CCTGGAAGCG 3180

ACCCAGGGGA GCCCAAGTTG TCAGCTTTTT GATCCTCTAC TGTGCATCTC CTTGACTGCC 3240

GCATCATGGG AGTAAGTTCT TCAAACACCC ATTTTTGCTC TATGGGAAAA AAACGATTTA 3300

CCAATTAATA TTACTCAGCA CAGAGATGGG GGCCCGGTTT CCATATTTTT TGCACAGCTA 3360

GCAATTGGGC TCCCTTTGCT GCTGATGGGC ATCATTGTTT AGGGGTGAAG GAGGGGGTTC 3420

TTCCTCACCT TGTAACCAGT GCAGAAATGA AATAGCTTAG CGGCAAGAAG CCGTTGAGGC 3480

GGTTTCCTGA ATTTCCCCAT CTGCCACAGG CCATATTTGT GGCCCGTGCA GCTTCCAAAT 3540

CTCATACACA ACTGTTCCCG ATTCACGTTT TTCTGGACCA AGGTGAAGCA AATTTGTGGT 3600

TGTAGAAGGA GCCTTGTTGG TGGAGAGTGG AAGGACTGTG GCTGCAG 3647



(2) INFORMATION FOR SEQ ID NO: 8:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 405 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Met Gly Ala Pro Trp Gly Ser Pro Thr Ala Ala Ala Gly Gly Arg Arg
1 5 10 15

Gly Trp Arg Arg Gly Arg Gly Leu Pro Trp Thr Val Cys Val Leu Ala
20 25 30

Ala Ala Gly Leu Thr Cys Thr Ala Leu Ile Thr Tyr Ala Cys Trp Gly
35 40 45

Gln Leu Pro Pro Leu Pro Trp Ala Ser Pro Thr Pro Ser Arg Pro Val
50 55 60

Gly Val Leu Leu Trp Trp Glu Pro Phe Gly Gly Arg Asp Ser Ala Pro
65 70 75 80

Arg Pro Pro Pro Asp Cys Pro Leu Arg Phe Asn Ile Ser Gly Cys Arg
85 90 95

Leu Leu Thr Asp Arg Ala Ser Tyr Gly Glu Ala Gln Ala Val Leu Phe
100 105 110

His His Arg Asp Leu Val Lys Gly Pro Pro Asp Trp Pro Pro Pro Trp
115 120 125

Gly Ile Gln Ala His Thr Ala Glu Glu Val Asp Leu Arg Val Leu Asp
130 135 140

Tyr Glu Glu Ala Ala Ala Ala Ala Glu Ala Leu Ala Thr Ser Ser Pro
145 150 155 160

Arg Pro Pro Gly Gln Arg Trp Val Trp Met Asn Phe Glu Ser Pro Ser
165 170 175

His Ser Pro Gly Leu Arg Ser Leu Ala Ser Asn Leu Phe Asn Trp Thr
180 185 190

Leu Ser Tyr Arg Ala Asp Ser Asp Val Phe Val Pro Tyr Gly Tyr Leu
195 200 205

Tyr Pro Arg Ser His Pro Gly Asp Pro Pro Ser Gly Leu Ala Pro Pro
210 215 220

Leu Ser Arg Lys Gln Gly Leu Val Ala Trp Val Val Ser His Trp Asp
225 230 235 240

Glu Arg Gln Ala Arg Val Arg Tyr Tyr His Gln Leu Ser Gln His Val
245 250 255

Thr Val Asp Val Phe Gly Arg Gly Gly Pro Gly Gln Pro Val Pro Glu
260 265 270

Ile Gly Leu Leu His Thr Val Ala Arg Tyr Lys Phe Tyr Leu Ala Phe
275 280 285

Glu Asn Ser Gln His Leu Asp Tyr Ile Thr Glu Lys Leu Trp Arg Asn
290 295 300

Ala Leu Leu Ala Gly Ala Val Pro Val Val Leu Gly Pro Asp Arg Ala
305 310 315 320

Asn Tyr Glu Arg Phe Val Pro Arg Gly Ala Phe Ile His Val Asp Asp
325 330 335

Phe Pro Ser Ala Ser Ser Leu Ala Ser Tyr Leu Leu Phe Leu Asp Arg
340 345 350

Asn Pro Ala Val Tyr Arg Arg Tyr Phe His Trp Arg Arg Ser Tyr Ala
355 360 365

Val His Ile Thr Ser Phe Trp Asp Glu Pro Trp Cys Arg Val Cys Gln
370 375 380

Ala Val Gln Arg Ala Gly Asp Arg Pro Lys Ser Ile Arg Asn Leu Ala
385 390 395 400

Ser Trp Phe Glu Arg
405

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1488 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

ATGGGGGCAC CGTGGGGCTC GCCGACGGCG GCGGCGGGCG GGCGGCGCGG GTGGCGCCGA 60 

GGCCCGGGGC TGCCATGGAC CGTCTGTGTG CTGGCGGCCG CCGGCTTGAC GTGTACGGCG 120 

CTGATCACCT ACGCTTGCTG GGGGCAGCTG CCGCCGCTGC CCTGGGCGTC GCCAACCCCG 180 

TCGCGACCGG TGGGCGTGCT GCTGTGGTGG GAGCCCTTCG GGGGGCGCGA TAGCGCCCCG 240 

AGGCCGCCCC CTGACTGCTG CTGGGGGCAG CTGCCGCCGC TGCCCTGGGC GTCGCCAACC 300 

CCGTCGCGAC CGGTGGGCGT GCTGCTGTGG TGGGAGCCCT TCGGGGGGCG CGATAGCGCC 360 

CCGAGGCCGC CCCCTGACTG CCCGCTGCGC TTCAACATCA GCGGCTGCCG CCTGCTCACC 420 

GACCGCGCGT CCTACGGAGA GGCTCAGGCC GTGCTTTTCC ACCACCGCGA CCTCGTGAAG 480 

GGGCCCCCCG ACTGGCCCCC GCCCTGGGGC ATCCAGGCGC ACACTGCCGA GCCGCTGCGC 540 

TTCAACATCA GCGGCTGCCG CCTGCTCACC GACCGCGCGT CCTACGGAGA GGCTCAGGCC 600 

GTGCTTTTCC ACCACCGCGA CCTCGTGAAG GGGCCCCCCG ACTGGCCCCC GCCCTGGGGC 660 

ATCCAGGCGC ACACTGCCGA GGAGGTGGAT CTGCGCGTGT TGGACTACGA GGAGGCAGCG 720 

GCGGCGGCAG AAGCCCTGGC GACCTCCAGC CCCAGGCCCC CGGGCCAGCG CTGGGTTTGG 780 

ATGAACTTCG AGTCGCCCTC GCACTCCCCG GGGCTGCGAA GCCTGGCAAG TAACCTCTTC 840 

AACTGGACGC TCTCCTACCG GGCGGACTCG GACGTCTTTG TGCCTTATGG CTACCTCTAC 900 

CCCAGAAGCC ACCCCGGCGA CCCGCCCTCA GGCCTGGCCC CGCCACTGTC CAGGAAACAG 960 

GGGCTGGTGG CATGGGTGGT GAGCCACTGG GACGAGCGCC AGGCCCGGGT CCGCTACTAC 1020 

CACCAACTGA GCCAACATGT GACCGTGGAC GTGTTCGGCC GGGGCGGGCC GGGGCAGCCG 1080 

GTGCCCGAAA TTGGGCTCCT GCACACAGTG GCCCGCTACA AGTTCTACCT GGCTTTCGAG 1140 

AACTCGCAGC ACCTGGATTA TATCACCGAG AAGCTCTGGC GCAACGCGTT GCTCGCTGGG 1200 

GCGGTGCCGG TGGTGCTGGG CCCAGACCGT GCCAACTACG AGCGCTTTGT GCCCCGCGGC 1260 

GCCTTCATCC ACGTGGACGA CTTCCCAAGT GCCTCCTCCC TGGCCTCGTA CCTGCTTTTC 1320 

CTCGACCGCA ACCCCGCGGT CTATCGCCGC TACTTCCACT GGCGCCGGAG CTACGCTGTC 1380 

CACATCACCT CCTTCTGGGA CGAGCCTTGG TGCCGGGTGT GCCAGGCTGT ACAGAGGGCT 1440 

GGGGACCGGC CCAAGAGCAT ACGGAACTTG GCCAGCTGGT TCGAGCGG 1488 




(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1316 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

TTTATGACAA GCTGTGTCAT AAATTATAAC AGCTTCTCTC AGGACACTGT GGCCAGGAAG 60 

TGGGTGATCT TCCTTAATGA CCCTCACTCC TCTCTCCTCT CTTCCCAGCT ACTCTGACCC 120 

ATGGATCCCC TGGGCCCAGC CAAGCCACAG TGGCTGTGGC GCCGCTGTCT GGCCGGGCTG 180 

CTGTTTCAGC TGCTGGTGGC TGTGTGTTTC TTCTCCTACC TGCGTGTGTC CCGAGACGAT 240 

GCCACTGGAT CCCCTAGGCC AGGGCTTATG GCAGTGGAAC CTGTCACCGG GGCTCCCAAT 300 

GGGTCCCGCT GCCAGGACAG CATGGCGACC CCTGCCCACC CCACCCTACT GATCCTGCTG 360 

TGGACGTGGC CTTTTAACAC ACCCGTGGCT CTGCCCCGCT GCTCAGAGAT GGTGCCCGGC 420 

GCGGCCGACT GCAACATCAC TGCCGACTCC AGTGTGTACC CACAGGCAGA CGCGGTCATC 480 

GTGCACCACT GGGATATCAT GTACAACCCC AGTGCCAACC TCCCGCCCCC CACCAGGCCG 540 

CAGGGGCAGC GCTGGATCTG GTTCAGCATG GAGTCCCCCA GCAACTGCCG GCACCTGGAA 600 

GCCCTGGACG GATACTTCAA TCTCACCATG TCCTACCGCA GCGACTCCGA CATCTTCACG 660 

CCCTACGGCT GGCTGGAGCC GTGGTCCGGC CAGCCTGCCC ACCCACCGCT CAACCTCTCG 720 

GCCAAGACCG AGCTGGTGGC CTGGGCGGTG TCCAACTGGA AGCCGGACTC GGCCAGGGTG 780 

CGCTACTACC AGAGCCTGCA GGCTCATCTC AAGGTGGACG TGTACGGACG CTCCCACAAG 840 

CCCCTGCCCA AGGGGACCAT GATGGAGACG CTGTCCCGGT ACAAGTTCTA TCTGGCCTTC 900 

GAGAACTCCT TGCACCCCGA CTACATCACC GAGAAGCTGT GGAGGAACGC CCTGGAGGCC 960 

TGGGCCGTGC CCGTGGTGCT GGGCCCCAGC AGAAGCAACT ACGAGAGGTT CCTGCCGCCC 1020 

GACGCCTTCA TCCACGTGGA TGACTTCCAG AGCCCCAAGG ACCTGGCCCG GTACCTGCAG 1080 

GAGCTGGACA AGGACCACGC CCGCTACCTG AGCTACTTTC GCTGGCGGGA GACGCTGCGG 1140 

CCTCGCTCCT TCAGCTGGGC ACTGGCTTTC TGCAAGGCCT GCTGGAAGCT GCAGCAGGAA 1200 

TCCAGGTACC AGACGGTGCG CAGCATAGCG GCTTGGTTCA CCTGAGAGGC CGGCATGGGG 1260 

CCTGGGCTGC CAGGGACCTC ACTTTCCCAG GGCCTCACCT ACCTAGGGTC TCTAGA 1316 
(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 374 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Met Asp Pro Leu Gly Pro Ala Lys Pro Gln Trp Leu Trp Arg Arg Cys 
1 5 10 15 

Leu Ala Gly Leu Leu Phe Gln Leu Leu Val Ala Val Cys Phe Phe Ser 
20 25 30 

Tyr Leu Arg Val Ser Arg Asp Asp Ala Thr Gly Ser Pro Arg Pro Gly 
35 40 45 

Leu Met Ala Val Glu Pro Val Thr Gly Ala Pro Asn Gly Ser Arg Cys 
50 55 60 

Gln Asp Ser Met Ala Thr Pro Ala His Pro Thr Leu Leu Ile Leu Leu 
65 70 75 80 

Trp Thr Trp Pro Phe Asn Thr Pro Val Ala Leu Pro Arg Cys Ser Glu 
85 90 95 

Met Val Pro Gly Ala Ala Asp Cys Asn Ile Thr Ala Asp Ser Ser Val 
100 105 110 

Tyr Pro Gln Ala Asp Ala Val Ile Val His His Trp Asp Ile Met Tyr 
115 120 125 

Asn Pro Ser Ala Asn Leu Pro Pro Pro Thr Arg Pro Gln Gly Gln Arg 
130 135 140 

Trp Ile Trp Phe Ser Met Glu Ser Pro Ser Asn Cys Arg His Leu Glu 
145 150 155 160 

Ala Leu Asp Gly Tyr Phe Asn Leu Thr Met Ser Tyr Arg Ser Asp Ser 
165 170 175 

Asp Ile Phe Thr Pro Tyr Gly Trp Leu Glu Pro Trp Ser Gly Gln Pro 
180 185 190 

Ala His Pro Pro Leu Asn Leu Ser Ala Lys Thr Glu Leu Val Ala Trp 
195 200 205 

Ala Val Ser Asn Trp Lys Pro Asp Ser Ala Arg Val Arg Tyr Tyr Gln 
210 215 220 

Ser Leu Gln Ala His Leu Lys Val Asp Val Tyr Gly Arg Ser His Lys 
225 230 235 240 

Pro Leu Pro Lys Gly Thr Met Met Glu Thr Leu Ser Arg Tyr Lys Phe 
245 250 255 

Tyr Leu Ala Phe Glu Asn Ser Leu His Pro Asp Tyr Ile Thr Glu Lys 
260 265 270 

Leu Trp Arg Asn Ala Leu Glu Ala Trp Ala Val Pro Val Val Leu Gly 
275 280 285 

Pro Ser Arg Ser Asn Tyr Glu Arg Phe Leu Pro Pro Asp Ala Phe Ile 
290 295 300 

His Val Asp Asp Phe Gln Ser Pro Lys Asp Leu Ala Arg Tyr Leu Gln 
305 310 315 320 

Glu Leu Asp Lys Asp His Ala Arg Tyr Leu Ser Tyr Phe Arg Trp Arg 
325 330 335 

Glu Thr Leu Arg Pro Arg Ser Phe Ser Trp Ala Leu Ala Phe Cys Lys 
340 345 350 

Ala Cys Trp Lys Leu Gln Gln Glu Ser Arg Tyr Gln Thr Val Arg Ser 
355 360 365 

Ile Ala Ala Trp Phe Thr 
370 

(2) INFORMATION FOR SEQ ID NO: 12: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1086 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

ATGGATCCCC TGGGTGCAGC CAAGCCACAA TGGCCATGGC GCCGCTGTCT GGCCGCACTG 60 

CTATTTCAGC TGCTGGTGGC TGTGTGTTTC TTCTCCTACC TGCGTGTGTC CCGAGACGAT 120 

GCCACTGGAT CCCCTAGGGC TCCCAGTGGG TCCTCCCGAC AGGACACCAC TCCCACCCGC 180 

CCCACCCTCC TGATCCTGCT ATGGACATGG CCTTTCCACA TCCCTGTGGC TCTGTCCCGC 240 

TGTTCAGAGA TGGTGCCCGG CACAGCCGAC TGCCACATCA CTGCCGACCG CAAGGTGTAC 300 

CCACAGGCAG ACACGGTCAT CGTGCACCAC TGGGATATCA TGTCCAACCC TAAGTCACGC 360 

CTCCCACCTT CCCCGAGGCC GCAGGGGCAG CGCTGGATCT GGTTCAACTT GGAGCCACCC 420 

CCTAACTGCC AGCACCTGGA AGCCCTGGAC AGATACTTCA ATCTCACCAT GTCCTACCGC 480 

AGCGACTCCG ACATCTTCAC GCCCTACGGC TGGCTGGAGC CGTGGTCCGG CCAGCCTGCC 540 

CACCCACCGC TCAACCTCTC GGCCAAGACC GAGCTGGTGG CCTGGGCGGT GTCCAACTGG 600 

AAGCCGGACT CAGCCAGGGT GCGCTACTAC CAGAGCCTGC AGGCTCATCT CAAGGTGGAC 660 

GTGTACGGAC GCTCCCACAA GCCCCTGCCC AAGGGGACCA TGATGGAGAC GCTGTCCCGG 720 

TACAAGTTCT ACCTGGCCTT CGAGAACTCC TTGCACCCCG ACTACATCAC CGAGAAGCTG 780 

TGGAGGAACG CCCTGGAGGC CTGGGCCGTG CCCGTGGTGC TGGGCCCCAG CAGAAGCAAC 840 

TACGAGAGGT TCCTGCCACC CGACGCCTTC ATCCACGTGG ACGACTTCCA GAGCCCCAAG 900 

GACCTGGCCC GGTACCTGCA GGAGCTGGAC AAGGACCACG CCCGCTACCT GAGCTACTTT 960 

CGCTGGCGGG AGACGCTGCG GCCTCGCTCC TTCAGCTGGG CACTGGATTT CTGCAAGGCC 1020 

TGCTGGAAAC TGCAGCAGGA ATCCAGGTAC CAGACGGTGC GCAGCATAGC GGCTTGGTTC 1080 

ACCTGA 1086 
(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1654 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

TTTTCTCATC TGTGAAACAG GAATAATAAC AGCTCTTCTC AGGACTCATG GCCTGGAGCT 60 

TTGGTAAGCA GGAGATTGTC ATCAATGACC CTCACTCCTC TCTCCCCACT TCCCAGAGAC 120 

TCTGACCCAT GGATCCCCTG GGCCCGGCCA AGCCACAGTG GTCGTGGCGC TGCTGTCTGA 180 

CCACGCTGCT GTTTCAGCTG CTGATGGCTG TGTGTTTCTT CTCCTATCTG CGTGTGTCTC 240 

AAGACGATCC CACTGTGTAC CCTAATGGGT CCCGCTTCCC AGACAGCACA GGGACCCCCG 300 

CCCACTCCAT CCCCCTGATC CTGCTGTGGA CGTGGCCTTT TAACAAACCC ATAGCTCTGC 360 

CCCGCTGCTC AGAGATGGTG CCTGGCACGG CTGACTGCAA CATCACTGCC GACCGCAAGG 420 

TGTATCCACA GGCAGACGCG GTCATCGTGC ACCACCGAGA GGTCATGTAC AACCCCAGTG 480 

CCCAGCTCCC ACGCTCCCCG AGGCGGCAGG GGCAGCGATG GATCTGGTTC AGCATGGAGT 540 

CCCCAAGCCA CTGCTGGCAG CTGAAAGCCA TGGACGGATA CTTCAATCTC ACCATGTCCT 600 

ACCGCAGCGA CTCCGACATC TTCACGCCCT ACGGCTGGCT GGAGCCGTGG TCCGGCCAGC 660 

CTGCCCACCC ACCGCTCAAC CTCTCGGCCA AGACCGAGCT GGTGGCCTGG GCAGTGTCCA 720 

ACTGGGGGCC AAACTCCGCC AGGGTGCGCT ACTACCAGAG CCTGCAGGCC CATCTCAAGG 780 

TGGACGTGTA CGGACGCTCC CACAAGCCCC TGCCCCAGGG AACCATGATG GAGACGCTGT 840 

CCCGGTACAA GTTCTATCTG GCCTTCGAGA ACTCCTTGCA CCCCGACTAC ATCACCGAGA 900 

AGCTGTGGAG GAACGCCCTG GAGGCCTGGG CCGTGCCCGT GGTGCTGGGC CCCAGCAGAA 960 

GCAACTACGA GAGGTTCCTG CCACCCGACG CCTTCATCCA CGTGGACGAC TTCCAGAGCC 1020 

CCAAGGACCT GGCCCGGTAC CTGCAGGAGC TGGACAAGGA CCACGCCCGC TACCTGAGCT 1080 

ACTTTCGCTG GCGGGAGACG CTGCGGCCTC GCTCCTTCAG CTGGGCACTC GCTTTCTGCA 1140 

AGGCCTGCTG GAAACTGCAG GAGGAATCCA GGTACCAGAC ACGCGGCATA GCGGCTTGGT 1200 

TCACCTGAGA GGCTGGTGTG GGGCCTGGGC TGCCAGGAAC CTCATTTTCC TGGGGCCTCA 1260 

CCTGAGTGGG GGCCTCATCT ACCTAAGGAC TCGTTTGCCT GAAGCTTCAC CTGCCTGAGG 1320 

ACTCACCTGC CTGGGACGGT CACCTGTTGC AGCTTCACCT GCCTGGGGAT TCACCTACCT 1380 

GGGTCCTCAC TTTCCTGGGG CCTCACCTGC TGGAGTCTTC GGTGGCCAGG TATGTCCCTT 1440 

ACCTGGGATT TCACATGCTG GCTTCCAGGA GCGTCCCCTG CGGAAGCCTG GCCTGCTGGG 1500 

GATGTCTCCT GGGGACTTTG CCTACTGGGG ACCTCGGCTG TTGGGGACTT TACCTGCTGG 1560 

GACCTGCTCC CAGAGACCTT CCACACTGAA TCTCACCTGC TAGGAGCCTC ACCTGCTGGG 1620 

GACCTCACCC TGGAGGCACT GGGCCCTGGG AACT 1654 
(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 359 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Met Asp Pro Leu Gly Pro Ala Lys Pro Gln Trp Ser Trp Arg Cys Cys 
1 5 10 15 

Leu Thr Thr Leu Leu Phe Gln Leu Leu Met Ala Val Cys Phe Phe Ser 
20 25 30 

Tyr Leu Arg Val Ser Gln Asp Asp Pro Thr Val Tyr Pro Asn Gly Ser 
35 40 45 

Arg Phe Pro Asp Ser Thr Gly Thr Pro Ala His Ser Ile Pro Leu Ile 
50 55 60 

Leu Leu Trp Thr Trp Pro Phe Asn Lys Pro Ile Ala Leu Pro Arg Cys 
65 70 75 80 

Ser Glu Met Val Pro Gly Thr Ala Asp Cys Asn Ile Thr Ala Asp Arg 
85 90 95 

Lys Val Tyr Pro Gln Ala Asp Ala Val Ile Val His His Arg Glu Val 
100 105 110 

Met Tyr Asn Pro Ser Ala Gln Leu Pro Arg Ser Pro Arg Arg Gln Gly 
115 120 125 

Gln Arg Trp Ile Trp Phe Ser Met Glu Ser Pro Ser His Cys Trp Gln 
130 135 140 

Leu Lys Ala Met Asp Gly Tyr Phe Asn Leu Thr Met Ser Tyr Arg Ser 
145 150 155 160 

Asp Ser Asp Ile Phe Thr Pro Tyr Gly Trp Leu Glu Pro Trp Ser Gly 
165 170 175 

Gln Pro Ala His Pro Pro Leu Asn Leu Ser Ala Lys Thr Glu Leu Val 
180 185 190 

Ala Trp Ala Val Ser Asn Trp Gly Pro Asn Ser Ala Arg Val Arg Tyr 
195 200 205 

Tyr Gln Ser Leu Gln Ala His Leu Lys Val Asp Val Tyr Gly Arg Ser 
210 215 220 

His Lys Pro Leu Pro Gln Gly Thr Met Met Glu Thr Leu Ser Arg Tyr 
225 230 235 240 

Lys Phe Tyr Leu Ala Phe Glu Asn Ser Leu His Pro Asp Tyr Ile Thr 
245 250 255 

Glu Lys Leu Trp Arg Asn Ala Leu Glu Ala Trp Ala Val Pro Val Val 
260 265 270 

Leu Gly Pro Ser Arg Ser Asn Tyr Glu Arg Phe Leu Pro Pro Asp Ala 
275 280 285 

Phe Ile His Val Asp Asp Phe Gln Ser Pro Lys Asp Leu Ala Arg Tyr 
290 295 300 

Leu Gln Glu Leu Asp Lys Asp His Ala Arg Tyr Leu Ser Tyr Phe Arg 
305 310 315 320 

Trp Arg Glu Thr Leu Arg Pro Arg Ser Phe Ser Trp Ala Leu Ala Phe 
325 330 335 

Cys Lys Ala Cys Trp Lys Leu Gln Glu Glu Ser Arg Tyr Gln Thr Arg 
340 345 350 

Gly Ile Ala Ala Trp Phe Thr 
355 



